Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: tag filter move to pkg:filters #5042

Closed

Conversation

ivankatliarchuk
Copy link
Contributor

@ivankatliarchuk ivankatliarchuk commented Jan 28, 2025

Description

Rationale, currently filters reside in providers package, there is nothing wrong, but worth to move them to it's own specific package. This should simplify maintenance, isolate testing and etc.

For AWS as example, there default limits are 500 zones with 50 tags attached to each zone. For that reason, added . Google cloud resources could have up-to 64 labels and 100 zones.

  • benchmark tests, so that if we decide to tweak filtering algorithm, there is a baseline
  • tags filter pre-processing on creation of the filter, so that filtering only responsible for one thing
  • first filter moved to pkg/filters/<filtername>
  • create file pkg/filter/filters.go with plan to expose filters with type aliases

By moving user input tag pre-processing away from the actual filter logic, the aim is to improve separation of concerns and try to boost efficiency

Before > BenchmarkZoneTagFilterMatchBasic-16    2145244      563.6 ns/op

After > BenchmarkZoneTagFilterMatchBasic-16      3024049.    406.9 ns/op

Before > BenchmarkZoneTagFilterComplex-16    	  333133	      3449 ns/op

After > BenchmarkZoneTagFilterComplex-16    	  765247	      1441 ns/op

follow-up:

  • move other filters, enhance with tests
  • introduce interface
  • we could have better coverage in regards edge-cases that currently not covered if any

Checklist

  • Unit tests updated
  • End user documentation updated

Just a bit more information about the why bit. I'm going over issues, trying to filter them, resolve where I could. Some of them related to multiple filters, at the moment there is an inconsistent behaviour among providers. So I started with documentation here. Docs reside on here https://github.com/kubernetes-sigs/external-dns/blob/master/docs/tutorials/aws.md#strategies-for-scoping-zones for now it's only AWS, as it's easier to validate for me, as I have quite cool setup there.

So the plan was to go over all of the filters, understand behaviour and move docs to some central place, so that we have a clear definition of all available filsters.

Signed-off-by: ivan katliarchuk <[email protected]>
* master: (23 commits)
  Fix spelling in webhook OpenAPI spec
  fix(docs): aws tutorial broken internal markdown links
  rollback to if checks
  fix(aws-provider): ListTagsForResource incorrect zone-id handling
  updated link
  fix(aws-provider): ListTagsForResource incorrect zone-id handling. docs updated
  fix(aws-provider): ListTagsForResource incorrect zone-id handling. docs updated
  chore: improve canonicalHostedZone alog improvement. cover case when hosted zone not yet added to file
  chore: improve canonicalHostedZone alog improvement. add comment
  chore: improve canonicalHostedZone alog improvement
  chore: improve canonicalHostedZone handling
  fix(aws-provider): ListTagsForResource incorrect zone-id handling. docs updated
  fix(aws-provider): ListTagsForResource incorrect zone-id handling. docs updated
  fix(aws-provider): ListTagsForResource incorrect zone-id handling. docs updated
  fix(aws-provider): ListTagsForResource incorrect zone-id handling. docs updated
  fix(aws-provider): ListTagsForResource incorrect zone-id handling. docs updated
  fix(aws-provider): ListTagsForResource incorrect zone-id handling. added tests
  fix(aws-provider): ListTagsForResource incorrect zone-id handling. added tests
  fix(aws-provider): ListTagsForResource incorrect zone-id handling. added tests
  fix(aws-provider): ListTagsForResource incorrect zone-id handling
  ...
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign szuecs for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 28, 2025
@k8s-ci-robot k8s-ci-robot requested a review from szuecs January 28, 2025 16:47
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 28, 2025
@ivankatliarchuk ivankatliarchuk marked this pull request as draft January 28, 2025 16:48
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 28, 2025
Signed-off-by: ivan katliarchuk <[email protected]>
@ivankatliarchuk
Copy link
Contributor Author

/test pull-external-dns-licensecheck

Signed-off-by: ivan katliarchuk <[email protected]>
@ivankatliarchuk
Copy link
Contributor Author

/label tide/merge-method-squash
/label kind/design

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jan 28, 2025
@k8s-ci-robot
Copy link
Contributor

@ivankatliarchuk: The label(s) /label kind/design cannot be applied. These labels are supported: api-review, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, team/katacoda, refactor, ci-short, ci-extended, ci-full. Is this label configured under labels -> additional_labels or labels -> restricted_labels in plugin.yaml?

In response to this:

/label tide/merge-method-squash
/label kind/design

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ivankatliarchuk
Copy link
Contributor Author

/kind design

@k8s-ci-robot k8s-ci-robot added the kind/design Categorizes issue or PR as related to design. label Jan 28, 2025
Signed-off-by: ivan katliarchuk <[email protected]>
@ivankatliarchuk ivankatliarchuk marked this pull request as ready for review January 28, 2025 18:52
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 28, 2025
Copy link

@Dadeos-Menlo Dadeos-Menlo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that it's my place to comment; but personally I struggle to appreciate the benefits of the proposed factoring out of the ZoneTagFilter into a separate "filters" package.

It strikes me that the ZoneTagFilter is currently only used by the AWS provider. So, if anything, I'd propose relocating it to be part of the "aws" provider's sub-directory. It appears to be the case that the ZoneTagFilter is a work-around for the fact that, unfortunately, it appears that the AWS Route53 API does not support server-side filtering of results by the desired tags. Also, what makes a "ZoneTag" specific to "Zones", the proposed behaviour - which I appreciate is entirely based upon the pre-existing implementation - appears to be fairly generic (i.e. pertains to "tags" in general, as opposed to "ZoneTags" in particular).

// ZoneTagFilter holds a list of zone tags to filter by
type ZoneTagFilter struct {
zoneTags []string
zoneTags []string

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the introduction of zoneTagsMap, a pre-processed map of key/value pairs described by zoneTags, zoneTags has become redundant and should be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed.

I was thinking to rename ZoneTagFilter to TagFilter, will see what mloiseleur view is on it.

value := strings.TrimSpace(parts[1])
z.zoneTagsMap[key] = value
} else {
z.zoneTagsMap[key] = ""

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If my understanding is correctly, I believe that this represents a logic flaw that may, or may not be, significant in this context…

My understanding is that a tag is a string conforming to one of the following forms:

  • "key" - a simply identifier tag
  • "key=value" - an identifier tag with an associated value constraint

which, by extension, also includes the form "key="; the question being, should "key=" be considered equivalent to "key"?

If the answer is yes, they are to be considered equivalent, then the proposed implementation is acceptable. If the answer is no, they represent distinct configurations, then the zoneTagsMaps should be defined as map[string][]string, where the value can either be nil (i.e. no value), or a string-slice (e.g. that may have zero length).

In any case, there should be unit-test cases defining such scenarios.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go it. So you mean the tag could potentially contain a special character or not. If we take all major cloud hyperscalers, the tag value do not support = in tag name, only in value, this my understanding . Yep, will create a test for that case

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not really about a "special value"; my question relates to whether there are two flavours of tag, one with a value (i.e. which could be an empty string) and one without, or whether the representation without a value (i.e. "tag") is considered equivalent to a tag specified with an empty value (i.e. "tag=").

The proposed implementation treats a tag specified without a value (i.e. "tag") and a tag specified with an empty value (i.e. "tag=") as being equivalent; which in this context may be correct, but there are potential scenarios where the two should be considered distinct (i.e. the Match(…) function should return false).

Something like:

NewZoneTagFilter([]string{"tag"}).Match(map[string]string{"tag": ""})
NewZoneTagFilter([]string{"tag="}).Match(map[string]string{"tag": ""})

Upon closer inspection, in answer to my own question, there are not two flavours of tag because the prototype of the Match(tagsMap map[string]string) precludes the presentation of the without-value case; which would have to be something like:

NewZoneTagFilter([]string{"tag"}).Match(map[string]string{"tag": nil}) == true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying not to check behaviour too much. This are the current tests

func TestZoneTagFilterMatch(t *testing.T) {

It could not be {"tag": nil} as the map is map[string]string which will allow empty but not nill. Which test is missing in your opinion then?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed; ignore me - my understanding was incomplete, "tag" and "tag=" are considered equivalent.

@ivankatliarchuk
Copy link
Contributor Author

Hi @Dadeos-Menlo . This is an existing tag implementation https://github.com/kubernetes-sigs/external-dns/blob/9619e6b1c3898dc7fa6812d70af753235ee12f74/provider/zone_tag_filter.go I'm just moving it out. I could have a look what is the difference between current zone_tag_filter and zone_type_filter, zone type filter validate where or not zone is public or private, not related to tags.

At the moment only AWS is supported, Azure do not support fetch by tags I could be wrong here. https://github.com/Azure/azure-sdk-for-go/blob/051993c29d9e2b0c7bd4997407e6401af5c28380/sdk/resourcemanager/privatedns/armprivatedns/options.go#L98 and public https://github.com/Azure/azure-sdk-for-go/blob/051993c29d9e2b0c7bd4997407e6401af5c28380/sdk/resourcemanager/dns/armdns/options.go#L72

GCP does not support fetch DNS zones by tags, but could filter by labels. Not too sure, correct me here.

Oracle cloud does not support filtering by tags https://docs.cloud.oracle.com/en-us/iaas/tools/go-sdk-examples/latest/dns/ListZones.go.html

Cloudflare not yet supports tags, but there are plans in the future to add them.
Akamai supports tags, but client could not request zones with certain tags.

Given the potential for future support of other providers, I believe it's beneficial to have it's own package. This will improve the flexibility and maintainability of the codebase in the long run.

The initial was to move all filters to the folder, add better test coverage, where required add benchmark tests as well.

Then create an interface, to make sure all filters follow the same pattern.

@ivankatliarchuk
Copy link
Contributor Author

If you think there are no benefits, fine by me. I'll close this pull request.

@Dadeos-Menlo
Copy link

If you think there are no benefits, fine by me. I'll close this pull request.

I'm not sure that I'd go as far as "there are no benefits" but it strikes me that the trajectory of the proposed change is to seek to provide a consistent tag filtering behaviour across all providers. Whilst a noble endeavour, it seems unlikely that all providers will support tags, and therefore the only way of homogenising tag filtering behaviour across all providers would be to drop support for filtering by tags. I'm also somewhat sceptical as to the need for filtering zones by tags; one would have to have sufficient zones to make managing them challenging, but not want to process all zones within a given provider, and not be sufficiently motivated to explicitly specify the domains to manage. Clearly, someone cared enough about the ability to implement it, for the AWS provider, but I suspect that there are other, higher impact, issues that would benefit from attention.

Whilst the current implementation is named ZoneTagFilter, it's really an aws-zone-tags filter, which could/should have originally been implemented as a generic tag filter but for whatever reason wasn't.

The ZoneTypeFilter demonstrates the challenges of attempting a faux generic implementation, in that it's acquired a bunch of AWS specific behaviour. I suggest that any functionality that is placed outside of the sigs.k8s.io/external-dns/pkg/providers package should be applicable to any/all providers; hence my observation that the tag filtering is more AWS-specific (i.e. belongs beneath sigs.k8s.io/external-dns/pkg/providers) than it is provider-agnostic (i.e. a candidate for relocating to the proposed, new, sigs.k8s.io/external-dns/pkg/filters package).

@ivankatliarchuk
Copy link
Contributor Author

Tag support varies across providers drastically, most of them do not have such a concept. AWS hyperscaler is the most common, hence it has bit more support in external-dns project, example only AWS have other registires implemented. Organisations of multi-tenant environments is simplified on AWS as well. Other providers have completely different usage model. Implementing new features in an open-source project requires community participation. If we want to add tag support for other providers, we'll need someone with the necessary expertise and willingness to contribute their time and effort.

So my assumption, the tag filter may only be aws specific, aka support of tags/labels for other providers. Simple math, ~90% of external-dns users are running simple workfloads, majority of them do not even need tags.

So code organisation could be different.

At the moment all filters are like butter spread all over. Nothing wrong, but harder to see patterns and improvements points, not just provider filters. Lack of common interface, lack of common behaviour, different level of test coverage, missing documentation, missing specs, as a result behaviour differ.

So the plan was to start centralising filters, documenting they behaviour and at some point to create a specification for filters. The work is purely based on reviewing issues and trying to make some of the things clear.

Currently, this specific filter is exclusively used within the AWS environment, but has no coupling with AWS, is generic. Its placement could either remain within the general "providers" directory or be moved to the "providers/aws" subdirectory or pkg/filters. Hence moving to provider/aws will push away from initial plan why this whole work started.

/area code-organization

@k8s-ci-robot
Copy link
Contributor

@ivankatliarchuk: The label(s) area/code-organization cannot be applied, because the repository doesn't have them.

In response to this:

Tag support varies across providers drastically, most of them do not have such a concept. AWS hyperscaler is the most common, hence it has bit more support in external-dns project, example only AWS have other registires implemented. Organisations of multi-tenant environments is simplified on AWS as well. Other providers have completely different usage model. Implementing new features in an open-source project requires community participation. If we want to add tag support for other providers, we'll need someone with the necessary expertise and willingness to contribute their time and effort.

So my assumption, the tag filter may only be aws specific, aka support of tags/labels for other providers. Simple math, ~90% of external-dns users are running simple workfloads, majority of them do not even need tags.

So code organisation could be different.

At the moment all filters are like butter spread all over. Nothing wrong, but harder to see patterns and improvements points, not just provider filters. Lack of common interface, lack of common behaviour, different level of test coverage, missing documentation, missing specs, as a result behaviour differ.

So the plan was to start centralising filters, documenting they behaviour and at some point to create a specification for filters. The work is purely based on reviewing issues and trying to make some of the things clear.

Currently, this specific filter is exclusively used within the AWS environment, but has no coupling with AWS, is generic. Its placement could either remain within the general "providers" directory or be moved to the "providers/aws" subdirectory or pkg/filters. Hence moving to provider/aws will push away from initial plan why this whole work started.

/area code-organization

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 4, 2025
@ivankatliarchuk
Copy link
Contributor Author

After talking with the owners, graduation to beta related work is the priority. Closing this for now.

@ivankatliarchuk ivankatliarchuk deleted the chore-tags-filter branch February 16, 2025 15:40
@ivankatliarchuk ivankatliarchuk restored the chore-tags-filter branch February 16, 2025 15:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/design Categorizes issue or PR as related to design. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants